Abstract. We define a hierarchical clustering method: $\alpha$-unchaining single
linkage, or SL($\alpha$). The input of this algorithm is a finite metric space and
a parameter $\alpha$. The method is sensitive to the density of the distribution
and offers a partial solution to the so-called chaining effect. We study the
theoretical properties of this method and provide some theoretical background
for the treatment of the chaining effect.



1. Introduction

A standard clustering method is an algorithm that takes as input a finite metric
space $(X, d)$ and gives as output a partition of $X$.

Kleinberg discussed in [5] the problem of clustering in an axiomatic way and
proposed a few basic properties that a clustering scheme should satisfy. Let
$\mathcal{P}(X)$ denote the set of all possible partitions of $X$. Fix a clustering
method $f$ so that $f(X, d) = \Pi \in \mathcal{P}(X)$. The properties proposed by
Kleinberg were:

• Scale invariance: For all $\alpha > 0$, $f(X, \alpha \cdot d) = \Pi$.

• Richness: Given a finite set $X$, for every $\Pi \in \mathcal{P}(X)$ there exists
a metric $d_\Pi$ on $X$ such that $f(X, d_\Pi) = \Pi$.

• Consistency: Let $\Pi = \{B_1, \dots, B_n\}$. Let $d'$ be any metric on $X$ such that

1) for all $x, x' \in B_i$, $d'(x, x') \leq d(x, x')$ and

2) for all $x \in B_i$, $x' \in B_j$, $i \neq j$, $d'(x, x') \geq d(x, x')$.

Then, $f(X, d') = \Pi$.

The author would like to express his gratitude to Bruce Hughes for pointing out such a nice
paper to read.

He then proved that no standard clustering scheme can satisfy these three
conditions simultaneously.

Carlsson and Mémoli study in [4] the analogous problem for clustering schemes
that yield hierarchical decompositions instead of a single partition of the space.
See also [5] and [3]. Hierarchical clustering methods also take as input a finite
metric space, but the output is a hierarchical family of partitions of $X$.

Carlsson and Mémoli approach the subject by focusing on a theoretical basis for
the study of hierarchical clustering (HC). In the spirit of Kleinberg's result, they
define a few reasonable conditions that a HC method should satisfy. They then
obtain the uniqueness result stated below as Theorem 4.2: the unique HC method
satisfying three basic conditions is the well-known single linkage hierarchical
clustering, SL HC.

They also study the theoretical properties of SL HC and obtain some interesting
results. Its main advantage seems to be that this method enjoys a form of
stability, defined by means of the Gromov-Hausdorff distance. However,
the main weakness of SL HC is the so-called chaining effect, which may merge
clusters that, in practice, should be detected by the algorithm and kept separate.
One way to address this difficulty is to take the density into account. In [5] the
same authors do this by including in the input of the algorithm a function that
provides that information.

Our aim is to define a HC method which offers some solution to this weakness
without including any extra information. The first challenge is that the notion
of which clusters should be detected by the algorithm depends on the characteristics
of the problem under study. The same happens with what we may consider the
undesired chaining effect. The definition from [5] refers to the higher
tendency of points to join a pre-existing group rather than to define the
nucleus of a new group or to join another single point. Our algorithm is oriented
to another aspect of the chaining effect: the tendency to merge two clusters
when the minimal distance between them is small, even though they may have dense
cores which are clearly far apart. This is typically the problem with SL HC.
We also count as an undesired chaining effect the case of two big clusters joined
by a chain of points or small clusters. These isolated points or small clusters may
be interpreted as noise in the sample, and we might want to capture the big picture
and ignore their effect.

There exist other methods that enjoy some sensitivity to density and offer some
resistance to these chaining effects, such as average linkage (AL) or complete
linkage (CL). These methods are extensively used in practice. However, although
the main chaining problem of SL HC is reduced, there appears a tendency to
merge isolated points with each other before joining them to a pre-existing
cluster, and this might be unwanted too. Also, these methods have been proved to be
extremely unstable in the sense of Carlsson and Mémoli, that is, small perturbations
of the data may yield very different dendrograms.

Herein, we define a new HC method on the basis of SL: $\alpha$-unchaining single
linkage, or SL($\alpha$). The definition of SL($\alpha$) is based on the dimension
of the flag complexes defined by the points of $X$. These flag complexes contain
some information about the density distribution of the sample. This approach allows
us to define a density-sensitive algorithm whose input is just the set of distances
between the points and a fixed parameter $\alpha \in \mathbb{N}$. This parameter
determines how sensitive the method is to the chaining effect.

Our intention is to give a theoretical basis for the study of the problem. So,
instead of testing the algorithm on examples of real data, we try to find
general properties characterizing what an undesired chaining of two blocks is
and how well the algorithm detects and unchains them.

The first question is how to define the chaining effect. Since the precise effect we
would like to avoid may depend on the problem at hand, we do not try to give a
specific definition, which might be useless in most situations. Instead, we try to
give a general framework for the problem. We define the concepts of chained
subsets and subsets chained through smaller blocks as situations of minimal
chaining, so that they contain what we consider the problematic examples.
Nevertheless, there may be many examples of chained subsets which should
clearly be merged, and there is room to be more restrictive. In this context, a HC
method is strongly chaining if every pair of chained subsets is always merged
before the two subsets appear contained in different clusters. A HC method is
completely chaining if, in addition, every pair of subsets chained through smaller
blocks is merged before the two subsets appear contained in different clusters.
Thus, strongly chaining methods and completely chaining methods are extremely
sensitive to these effects. This is the case, for example, of SL HC. See Theorem 5.8.

We also give precise conditions to describe two blocks that should necessarily
appear as independent blocks at some point. The definition considers two blocks
that have small dense cores and such that the distance between any pair of
their respective points is bigger than their diameter, except for a single pair of
points. This pair of points creates the chaining between their cores. This is a
particular, more restrictive, example of chained subsets. We say that a HC method is
weakly unchaining if, at least, it distinguishes that pair of blocks. We then prove
that SL($\alpha$) satisfies this condition, while other methods which are not
strongly chaining, such as AL and CL HC, fail to be weakly unchaining. See
Theorem 8.3, Corollary 8.4 and Example 8.5.

We also define a minimal condition on two subsets chained by single points that
should be detected. We say that a HC method is $\alpha$-unchaining if it is able to
separate two blocks in that situation. SL($\alpha$) is proved to be more sensitive
than that: it also detects some classes of chaining through smaller blocks, as
proved in Proposition 8.8. In particular, SL($\alpha$) is $\alpha$-unchaining. See
Corollary 8.9.

The structure of the paper is the following: 

Section 2 contains the basic definitions and notation involved. It may be skipped
by experts.

In Section 3 we recall some well-known hierarchical clustering methods and include
some different ways to formulate them.

In Section 4 we recall the axiomatic characterization of SL HC from [3]. We
discuss the basic properties and find an alternative (related) characterization.

In Section 5 we treat the problem of the chaining effect. Here, we define the
concepts of chained subsets and subsets chained through smaller blocks. We define
the properties of being strongly chaining and completely chaining for HC methods
which are extremely sensitive to the chaining effect. We prove that SL is strongly
chaining and completely chaining while AL and CL HC are not.

In Section 6 we introduce SL($\alpha$). We include a short explanation of the role
of each step of the method and check it on a few examples.

Section 7 studies the general properties of SL($\alpha$) and proves that it shares
many basic properties with SL. In particular, we see that this method is permutation
invariant, has the richness property and leaves ultrametric spaces invariant. We
also see that SL($\alpha$) is not stable in the Gromov-Hausdorff sense.

Section 8 studies the unchaining properties of SL($\alpha$). In particular, we define
the concepts of weakly unchaining and $\alpha$-unchaining. We prove that SL($\alpha$)
is weakly unchaining and $\alpha$-unchaining while other methods that are partially
sensitive to the chaining effect, such as AL and CL, are not.

2. Background and notation 

A dendrogram over a finite set is a nested family of partitions. This is usually
represented as a rooted tree.

Let $\mathcal{P}(X)$ denote the collection of all partitions of a finite set
$X = \{x_1, \dots, x_n\}$. Then, a dendrogram can also be described as a map
$\theta\colon [0, \infty) \to \mathcal{P}(X)$ such that:

1. $\theta(0) = \{\{x_1\}, \{x_2\}, \dots, \{x_n\}\}$,

2. there exists $T$ such that $\theta(t) = \{X\}$ for every $t \geq T$,

3. if $r \leq s$ then $\theta(r)$ refines $\theta(s)$,

4. for all $r$ there exists $\varepsilon > 0$ such that $\theta(r) = \theta(t)$ for $t \in [r, r + \varepsilon]$.

Notice that conditions 2 and 4 imply that there exist $t_0 < t_1 < \dots < t_m$ such
that $\theta(r) = \theta(t_{i-1})$ for every $r \in [t_{i-1}, t_i)$, $i = 1, \dots, m$,
and $\theta(r) = \theta(t_m) = \{X\}$ for every $r \in [t_m, \infty)$.

For any partition $\{B_1, \dots, B_k\} \in \mathcal{P}(X)$, the subsets $B_i$ are called blocks.

Let $\mathcal{D}(X)$ denote the collection of all possible dendrograms over a finite set $X$.
Given some $\theta \in \mathcal{D}(X)$, let us denote $\theta(t) = \{B^t_1, \dots, B^t_{k(t)}\}$.
Therefore, the nested family of partitions is given by the corresponding partitions
at $t_0, \dots, t_m$, that is, $\{B^{t_i}_1, \dots, B^{t_i}_{k(t_i)}\}$, $i = 0, \dots, m$.

An ultrametric space is a metric space $(X, d)$ such that
$d(x, y) \leq \max\{d(x, z), d(z, y)\}$ for all $x, y, z \in X$. Given a finite
metric space $X$, let $\mathcal{U}(X)$ denote the set of all ultrametrics over $X$.

There is a well-known equivalence between trees and ultrametrics. See [7] or [10]
for a complete exposition of how to build categorical equivalences between them.
In particular, this may be translated into an equivalence between dendrograms and
ultrametrics:

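Let us define the functor $\eta\colon \mathcal{D}(X) \to \mathcal{U}(X)$ as follows:

Given a dendrogram $\theta \in \mathcal{D}(X)$, let $\eta(\theta) = u_\theta$ be such that
$u_\theta(x, x') = \min\{r \geq 0 \mid x, x' \text{ belong to the same block of } \theta(r)\}$.

Proposition 2.1. [4, Theorem 9] $\eta\colon \mathcal{D}(X) \to \mathcal{U}(X)$ is a bijection.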
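
The map $\eta$ can be made concrete with a short computation. The following Python sketch is only illustrative: it assumes that a dendrogram is stored as a list of `(t, partition)` pairs at the finitely many heights where it changes (a representation chosen here for convenience), and the function name `ultrametric_from_dendrogram` is hypothetical.

```python
from itertools import combinations

def ultrametric_from_dendrogram(levels):
    """levels: list of (t, partition) sorted by t; each partition is a list of sets.

    Returns u with u[(x, y)] = smallest listed t whose partition puts x and y
    in the same block.
    """
    points = sorted(set().union(*levels[0][1]))
    u = {(x, x): 0.0 for x in points}
    for x, y in combinations(points, 2):
        for t, partition in levels:
            if any(x in block and y in block for block in partition):
                u[(x, y)] = u[(y, x)] = float(t)
                break
    return u

# Hypothetical dendrogram on {a, b, c}: a and b merge at 1, everything at 3.
levels = [
    (0, [{"a"}, {"b"}, {"c"}]),
    (1, [{"a", "b"}, {"c"}]),
    (3, [{"a", "b", "c"}]),
]
u = ultrametric_from_dendrogram(levels)
```

In this toy case the result satisfies the ultrametric inequality, for instance $u(a, c) \leq \max\{u(a, b), u(b, c)\}$.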

3. Hierarchical clustering methods 

Let us recall the definition of some hierarchical clustering methods. We include
here the description of single linkage by its $t$-connected components and the
recursive description of single linkage, complete linkage and average linkage. We
also introduce an alternative description of these methods, based on the previous
one, defining the graph $G^\ell_R$. This graph will be the key to building our method,
SL($\alpha$), and might be useful for defining other algorithms better adapted to
specific problems.

An $\varepsilon$-chain is a finite sequence of points $x_0, \dots, x_N$ that are
separated by distances less than or equal to $\varepsilon$: $|x_i - x_{i+1}| \leq \varepsilon$.
Two points are $\varepsilon$-connected if there is an $\varepsilon$-chain joining them.
Any two points in an $\varepsilon$-connected set can be linked by an $\varepsilon$-chain.
An $\varepsilon$-component is a maximal $\varepsilon$-connected subset.

Clearly, given a metric space and any $\varepsilon > 0$, there is a partition of $X$ into its
$\varepsilon$-components $\{C^\varepsilon_1, \dots, C^\varepsilon_{k(\varepsilon)}\}$.

Let $X$ be a finite metric space. The single linkage HC is defined by the map
$\theta\colon [0, \infty) \to \mathcal{P}(X)$ such that $\theta(t)$ is the partition of $X$ into its $t$-components.
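
As an illustration of this definition, the partition into $t$-components can be computed with a union-find structure. This is a minimal sketch, not an optimized implementation; the names `t_components`, `pts` and `line` are hypothetical, and the sample points are invented for the example.

```python
def t_components(points, dist, t):
    """Partition `points` into t-components: x and y share a block iff a chain
    of points with consecutive distances <= t joins them."""
    parent = {p: p for p in points}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for i, x in enumerate(points):
        for y in points[i + 1:]:
            if dist(x, y) <= t:
                parent[find(x)] = find(y)

    blocks = {}
    for p in points:
        blocks.setdefault(find(p), set()).add(p)
    return sorted(blocks.values(), key=min)

# Hypothetical sample: points on the real line with the usual distance.
pts = [0.0, 1.0, 2.0, 5.0, 6.0]
line = lambda x, y: abs(x - y)
```

With these points, `t_components(pts, line, 1)` groups 0, 1 and 2 together although 0 and 2 are at distance 2: the block is held together by the chain through 1, which is the mechanism behind the chaining effect.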

In [4], there is also an alternative formulation of SL HC. They use a recursive
procedure by which they redefine SL HC, average linkage (AL) and complete
linkage (CL) hierarchical clustering. The main advantage of this procedure is that
it allows merging more than two clusters at the same time. Therefore, AL and CL
HC can be made permutation invariant (see definition below). We reproduce their
formulation here for completeness.

Let $(X, d)$ be a finite metric space where $X = \{x_1, \dots, x_n\}$ and let $L$ denote a
family of linkage functions on $X$:

$$L := \{\ell\colon \mathcal{C}(X) \times \mathcal{C}(X) \to \mathbb{R}^+ \mid \ell \text{ is bounded and non-negative}\}$$

where $\mathcal{C}(X)$ denotes the collection of all non-empty subsets of $X$.
Some standard choices for $\ell$ are:

• Single linkage: $\ell^{SL}(B, B') = \min_{(x,x') \in B \times B'} d(x, x')$

• Complete linkage: $\ell^{CL}(B, B') = \max_{(x,x') \in B \times B'} d(x, x')$

• Average linkage: $\ell^{AL}(B, B') = \frac{\sum_{(x,x') \in B \times B'} d(x, x')}{\#(B) \cdot \#(B')}$, where $\#(X)$ denotes the
cardinal of the set $X$.
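
The three linkage functions translate directly into code. The sketch below is only an illustration; the function names and the sample blocks are hypothetical.

```python
from itertools import product

def linkage_sl(B, Bp, d):
    """Single linkage: smallest distance between a point of B and a point of Bp."""
    return min(d(x, y) for x, y in product(B, Bp))

def linkage_cl(B, Bp, d):
    """Complete linkage: largest distance between a point of B and a point of Bp."""
    return max(d(x, y) for x, y in product(B, Bp))

def linkage_al(B, Bp, d):
    """Average linkage: mean distance over all pairs in B x Bp."""
    return sum(d(x, y) for x, y in product(B, Bp)) / (len(B) * len(Bp))

# Hypothetical blocks on the real line.
d = lambda x, y: abs(x - y)
B, Bp = {0, 1}, {3, 7}
```

For these blocks the pairwise distances are 3, 7, 2 and 6, so the three linkage values are 2, 7 and 4.5 respectively.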

Fix some linkage function $\ell \in L$. Then, the recursive formulation is as follows:

1. For each $R > 0$ consider the equivalence relation $\sim_{\ell,R}$ on blocks of a
partition $\Pi \in \mathcal{P}(X)$, given by $B \sim_{\ell,R} B'$ if and only if there is a sequence of
blocks $B = B_1, \dots, B_s = B'$ in $\Pi$ with $\ell(B_k, B_{k+1}) \leq R$ for $k = 1, \dots, s - 1$.

2. Consider the sequences $R_1, R_2, \dots \in [0, \infty)$ and $\Theta_1, \Theta_2, \dots \in \mathcal{P}(X)$ given
by $\Theta_1 := \{x_1, \dots, x_n\}$, and recursively for $i \geq 1$ by $\Theta_{i+1} = \Theta_i / \sim_{\ell,R_i}$ where
$R_i := \min\{\ell(B, B') \mid B, B' \in \Theta_i, B \neq B'\}$.

3. Finally, let $\theta^\ell\colon [0, \infty) \to \mathcal{P}(X)$ be such that $\theta^\ell(r) := \Theta_{i(r)}$ with
$i(r) := \max\{i \mid R_i \leq r\}$.
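
The recursive procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the function `recursive_hc` and the helper names are hypothetical, and the function returns each merge height $R_i$ together with the partition obtained after merging at that height, rather than the map $\theta^\ell$ itself.

```python
from itertools import product

def ell_sl(B, Bp, d):
    return min(d(x, y) for x, y in product(B, Bp))

def ell_cl(B, Bp, d):
    return max(d(x, y) for x, y in product(B, Bp))

def recursive_hc(points, d, ell):
    """Return [(R_1, partition after merging at R_1), (R_2, ...), ...]."""
    theta = [frozenset([p]) for p in points]
    history = []
    while len(theta) > 1:
        n = len(theta)
        # Step 2: R_i is the smallest linkage value between distinct blocks.
        R = min(ell(theta[i], theta[j], d)
                for i in range(n) for j in range(i + 1, n))
        # Step 1: classes of the relation ~_{ell,R}; all linked blocks merge at once.
        label = list(range(n))
        for i in range(n):
            for j in range(i + 1, n):
                if label[i] != label[j] and ell(theta[i], theta[j], d) <= R:
                    old, new = label[j], label[i]
                    label = [new if l == old else l for l in label]
        merged = {}
        for l, B in zip(label, theta):
            merged[l] = merged.get(l, frozenset()) | B
        theta = list(merged.values())
        history.append((R, [set(B) for B in theta]))
    return history

# Hypothetical sample on the real line.
pts = [0, 1, 2, 5]
dist = lambda x, y: abs(x - y)
sl_history = recursive_hc(pts, dist, ell_sl)
cl_history = recursive_hc(pts, dist, ell_cl)
```

In this sample, complete linkage merges 0, 1 and 2 in a single step at height 1, since the relation is generated by chains of blocks. Merging all tied pairs simultaneously is what makes the result independent of the order of the points.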

Remark 3.1. Notice that in this algorithm we can reformulate step 1 as follows.
Let $G^\ell_R$ be a graph whose vertices are the blocks of the partition $\Pi \in \mathcal{P}(X)$ and let
us consider an edge joining $B$ and $B'$ if and only if $\ell(B, B') \leq R$. Then,
$B \sim_{\ell,R} B'$ if and only if $B, B'$ are in the same connected component of $G^\ell_R$. This
reformulation allows us to use other properties of the graph $G^\ell_R$, along with the
connected components, to introduce further conditions to merge the blocks. See
Section 6.

4. Single linkage hierarchical clustering 

In this section we recall some basic properties and the characterization of SL
HC from [4]. We analyse the properties involved and propose some alternatives.
Our intention is to study which properties SL and SL($\alpha$) have in common and
which properties these methods do not share.

4.1. Properties of SL. Let us recall the definition of Gromov-Hausdorff distance
from [1]. See also [BJ.

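Let $(X, d_X)$ and $(Y, d_Y)$ be two metric spaces. A correspondence (between $A$ and
$B$) is a subset $R \subseteq A \times B$ such that

• $\forall a \in A$, there exists $b \in B$ s.t. $(a, b) \in R$

• $\forall b \in B$, there exists $a \in A$ s.t. $(a, b) \in R$

Let $\mathcal{R}(A, B)$ denote the set of all possible correspondences between $A$ and $B$.
Let $\Gamma_{X,Y}\colon X \times Y \times X \times Y \to \mathbb{R}^+$ be given by

$$(x, y, x', y') \mapsto |d_X(x, x') - d_Y(y, y')|.$$

Then, the Gromov-Hausdorff distance between $X$ and $Y$ is:

$$d_{\mathcal{GH}}(X, Y) := \frac{1}{2} \inf_{R \in \mathcal{R}(X,Y)} \sup_{(x,y),(x',y') \in R} \Gamma_{X,Y}(x, y, x', y').$$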
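
For very small finite spaces the definition can be evaluated by brute force. The following sketch enumerates every correspondence, so it is exponential in $\#(X) \cdot \#(Y)$ and only meant to illustrate the formula; the function name `gromov_hausdorff` and the sample spaces are hypothetical.

```python
from itertools import chain, combinations, product

def gromov_hausdorff(X, dX, Y, dY):
    """Brute-force Gromov-Hausdorff distance between tiny finite metric spaces."""
    pairs = list(product(X, Y))
    subsets = chain.from_iterable(
        combinations(pairs, k) for k in range(1, len(pairs) + 1))
    best = float("inf")
    for R in subsets:
        # R must project onto all of X and onto all of Y.
        if {x for x, _ in R} != set(X) or {y for _, y in R} != set(Y):
            continue
        distortion = max(abs(dX(x, xp) - dY(y, yp))
                         for (x, y), (xp, yp) in product(R, R))
        best = min(best, distortion)
    return best / 2

# Two hypothetical two-point spaces with distances 1 and 3, and a one-point space.
X, Y, P = ["a", "b"], ["p", "q"], ["o"]
dX = lambda x, y: 0 if x == y else 1
dY = lambda x, y: 0 if x == y else 3
dP = lambda x, y: 0
```

For the two-point spaces the optimal correspondence is a bijection with distortion $|1 - 3| = 2$, so the distance is 1. Against the one-point space, both points of $Y$ must be matched to the same point, which gives distortion 3.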

A HC method $\mathfrak{T}$ is stable in the Gromov-Hausdorff sense (see [4]) if for any pair
of finite metric spaces $(X, d_X)$, $(Y, d_Y)$,

$$d_{\mathcal{GH}}(X, Y) \geq d_{\mathcal{GH}}(\eta(\mathfrak{T}(X)), \eta(\mathfrak{T}(Y))).$$

A hierarchical clustering method is said to be permutation invariant if it yields 
the same dendrogram under permutation of the points in the sample. 

Proposition 4.1. [3] SL HC is permutation invariant and stable in the
Gromov-Hausdorff sense.

4.2. Characterization of SL. Carlsson and Mémoli provided the following
axiomatic characterization of SL HC:

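Theorem 4.2. [4, Theorem 18] Let $\mathfrak{T}$ be a hierarchical clustering method s.t.

(I) $\mathfrak{T}\left(\{p, q\}, \begin{pmatrix} 0 & \delta \\ \delta & 0 \end{pmatrix}\right) = \left(\{p, q\}, \begin{pmatrix} 0 & \delta \\ \delta & 0 \end{pmatrix}\right)$ for all $\delta > 0$.

(II) Given two finite metric spaces $X, Y$ and $\phi\colon X \to Y$ such that
$d_X(x, x') \geq d_Y(\phi(x), \phi(x'))$ for all $x, x' \in X$, then

$$u_X(x, x') \geq u_Y(\phi(x), \phi(x'))$$

also holds for all $x, x' \in X$, where $\mathfrak{T}(X, d_X) = \theta_X$, $\eta(\theta_X) = u_X$ and
$\mathfrak{T}(Y, d_Y) = \theta_Y$, $\eta(\theta_Y) = u_Y$.

(III) For any metric space $(X, d)$,

$$u(x, x') \geq sep(X, d) \text{ for all } x \neq x' \in X$$

where $\mathfrak{T}(X, d) = \theta$ and $\eta(\theta) = u$.

Then, $\mathfrak{T}$ is exactly single linkage hierarchical clustering.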

Notation: For any HC method $\mathfrak{T}$ and any finite metric space $X$, let us denote
$\mathfrak{T}(X) = \theta_X$ and $\eta(\theta_X) = u_X$. If there is no need to distinguish the metric space we
shall just write $\mathfrak{T}(X) = \theta$ and $\eta(\theta) = u$. For the particular case of SL HC let us
denote $\mathfrak{T}_{SL}(X) = \theta_{SL}$ and $\eta(\theta_{SL}) = u_{SL}$.

Notation: Given two metrics $d, d'$ defined on a set $X$, let us denote $d \leq d'$ if
$d(x, x') \leq d'(x, x')$ $\forall x, x' \in X$.

The following propositions follow immediately from the proof of [4, Theorem 18]:

Proposition 4.3. If $\mathfrak{T}$ satisfies conditions II) and III), then $u \geq u_{SL}$.

That is, the effort to join two points is at least the effort of the minimal chaining
between them. It is readily seen that if $u \geq u_{SL}$, then $\mathfrak{T}$ satisfies III).

Proposition 4.4. If $\mathfrak{T}$ satisfies conditions I) and II), then $u_{SL} \geq u$.

In fact, Proposition 4.4 can be improved by introducing the following condition.

A1) Let $(Y, d)$ be a metric space and $X \subset Y$. If $i\colon X \to Y$ is the inclusion map,
then $u(x, x') \geq u(i(x), i(x'))$.

That is, by adding points to the space we may make the ultrametric distance
smaller but never bigger. Clearly, II) $\Rightarrow$ A1). The proof of Proposition 4.4 (see
[4]) can be trivially adapted to obtain the following.

Proposition 4.5. If $\mathfrak{T}$ satisfies conditions I) and A1), then $u_{SL} \geq u$.

Another natural condition to ask of a hierarchical clustering method is that it
leaves invariant any ultrametric space:

A2) If $(X, d)$ is an ultrametric space, then $u(x, y) = d(x, y)$.

That is, applying the hierarchical clustering method to an ultrametric space we
obtain the same ultrametric space.

It can be readily seen that SL HC satisfies A2):

Proposition 4.6. If $(X, d)$ is an ultrametric space, then $u_{SL}(x, y) = d(x, y)$ for
every $x, y \in X$.

Proof. By definition, it is clear that $u_{SL}(x, y) \leq d(x, y)$ for every $x, y \in X$.

Let us see that, if $(X, d)$ is an ultrametric space, then $u_{SL}(x, y) \geq d(x, y)$.
Recall that $u_{SL}(x, y) = \inf\{t \mid \text{there exists a } t\text{-chain joining } x \text{ to } y\}$.
Suppose $u_{SL}(x, y) = t$ and let $x = x_0, x_1, \dots, x_n = y$ be a $t$-chain joining $x$ to $y$.
By the properties of the ultrametric,
$d(x_{i-1}, x_{i+1}) \leq \max\{d(x_{i-1}, x_i), d(x_i, x_{i+1})\} \leq t$ for every $1 \leq i \leq n - 1$.
Therefore, by induction on the length of the chain, $d(x, y) \leq t$ and $u_{SL}(x, y) \geq d(x, y)$. □
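
The statement of Proposition 4.6 can be checked numerically on small examples. The sketch below computes $u_{SL}$ as a minimax path cost with a Floyd-Warshall style recursion, which is one standard way to obtain it and is not taken from [4]; the function name and the two distance matrices are hypothetical.

```python
def single_linkage_ultrametric(D):
    """u[i][j] = smallest t such that a t-chain joins i and j, computed as a
    minimax path cost (Floyd-Warshall with max in place of +)."""
    n = len(D)
    u = [row[:] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u

# Hypothetical ultrametric on 4 points: {0,1} and {2,3} merge at 1, all at 4.
D_ultra = [[0, 1, 4, 4],
           [1, 0, 4, 4],
           [4, 4, 0, 1],
           [4, 4, 1, 0]]
# For contrast, four equally spaced points on a line (not an ultrametric).
D_line = [[abs(i - j) for j in range(4)] for i in range(4)]
```

The ultrametric input is returned unchanged, while on the line every pair of distinct points ends up at ultrametric distance 1, the length of the longest step in the chain joining them.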

The richness property for HC methods can be defined in the same way Kleinberg
did for standard clustering. Thus, a HC method satisfies the richness property if,
given a finite set $X$, for every $\theta \in \mathcal{D}(X)$ there exists a metric $d_\theta$ on $X$ such that
$f(X, d_\theta) = \theta$.

Corollary 4.7. $\mathfrak{T}_{SL}$ satisfies the richness property.

It is trivial to check that A2) $\Rightarrow$ I). Therefore, we also obtain the following
corollary.

Corollary 4.8. Let $\mathfrak{T}$ be a hierarchical clustering method s.t.

A0) $u \geq u_{SL}$.

A1) Let $(Y, d)$ be a metric space and $X \subset Y$. If $i\colon X \to Y$ is the inclusion map,
then $u_X(x, x') \geq u_Y(i(x), i(x'))$.

A2) If $(X, d)$ is an ultrametric space, then $u_X(x, y) = d(x, y)$.

Then, $\mathfrak{T}$ is exactly SL HC.

5. Chaining effect 

The chaining effect is usually mentioned as one of the problems to solve in
clustering. However, there are different approaches to defining "chaining effects".

In [4], the authors refer to the chaining effect from [5], which is the one defined
by Williams, Lambert and Lance in [11]. This version of the "chaining effect" takes
into account the tendency of a group to merge with single points or small groups
rather than with other groups of comparable size. Thus, they study and
measure it by comparing the cardinality of the groups.

Herein, we focus on another aspect. We want to deal with the tendency
of two groups to be identified when they are linked by some $\varepsilon$-chain while, if we
look at their density distribution, we find that the groups have dense cores which
appear to be at a distance much greater than $\varepsilon$. See Figures 1, 2 and 5. This is,
typically, the chaining effect one finds in SL HC.

First of all, we would like to give some concrete criterion to evaluate how sensitive
a method is to this effect. To do this, we define the concepts of chained subsets
and subsets chained through smaller blocks.

Definition 5.1. Let $X$ be a finite metric space. We say that two $(t_i)$-connected
subsets of $X$, $B_1, B_2$, are $(t_j, t_i)$-chained subsets if

i) $\min\{t \mid x \sim_t y \ \forall x, y \in B_1\} = t_i$,

ii) there exist $x_0 \in B_1$ and $y_0 \in B_2$ such that $d(x_0, y_0) = t_j \leq t_i$ and
$\forall (x_0, y_0) \neq (x, y) \in B_1 \times B_2$, $d(x, y) > t_i$.

If the parameters $t_j, t_i$ are not relevant, we say simply that $B_1, B_2$ are chained
subsets.

It is not difficult to imagine many examples (clearly, not all of them!) containing
two chained subsets, $B_1, B_2$, in the conditions above for which the desired
hierarchical clustering method $\mathfrak{T}$ should detect the clustering $\{B_1, B_2\}$ at some point.
More precisely, $\mathfrak{T}(X)$ should be able to yield a dendrogram $\theta$ such that, for some
$t$, $\theta(t) = \{B_1, B_2\}$.

Definition 5.2. Let $X$ be a finite metric space, $B_0, \dots, B_k$ be $(t_i)$-connected subsets
of $X$ and $t_j \leq t_i$. We say that $B_0$ and $B_k$ are $(t_j, t_i)$-chained through $\alpha$-smaller
blocks if the following conditions hold:

i) $\min\{t \mid x \sim_t y \ \forall x, y \in B_0\} = t_i$,

ii) there exists a $t_j$-chain $x_0, \dots, x_k$ with $x_s \in B_s$ for every $s = 0, \dots, k$ and
$\forall (x, y) \in B_0 \times B_k$, $d(x, y) > t_i$.

iii) $\alpha \cdot \#(B_s) < \min\{\#(B_0), \#(B_k)\}$ for every $1 \leq s \leq k - 1$.

If the parameters $t_j, t_i, \alpha$ are not relevant, we simply say that $B_0, B_k$ are chained
through smaller blocks.

Definition 5.3. Let $\mathfrak{T}$ be a HC method and $\mathfrak{T}(X) = \theta$. We say that $\mathfrak{T}$ is strongly
chaining if for any set $X$, any pair of chained subsets $B_1, B_2$ of $X$ and any $t > 0$,
if $B_1$ is contained in some block $B$ of $\theta(t)$, then $y_0 \in B$.

Definition 5.4. Let $\mathfrak{T}$ be a strongly chaining HC method and $\mathfrak{T}(X) = \theta$. We say
that $\mathfrak{T}$ is completely chaining if for any set $X$, any pair of components $B_0, B_k$ of
$X$ chained through smaller blocks and any $t > 0$, if $B_0$ is contained in some block
$B$ of $\theta(t)$, then $\{x_0, \dots, x_k\} \subset B$.

Example 5.5. Consider the graph represented in Figure 1.

Suppose the edges in $N_1, N_2$ have length $t_{i_1}$ and the rest have length $t_{i_2}$ with
$t_{i_1} < t_{i_2}$. As we can see in the picture, there are 8 $(t_{i_1})$-components: two of them,
$N_1$ and $N_2$, have four points and the rest are singletons. The whole space is
$(t_{i_2})$-connected with $d(x_1, x_2) = d(y_1, y_2) = t_{i_2} > t_{i_1}$.

In particular, $B_1$ and $B_2$ are $(t_{i_2}, t_{i_2})$-chained subsets.

Figure 1. $B_1$ and $B_2$ are $(t_{i_2}, t_{i_2})$-chained subsets.



Example 5.6. Consider the graph represented in Figure 2.

Suppose that the edges in $N_1, N_2$ have length $t_{i_1}$, the edges $\{x_0, z_0\}$ and $\{z_0, y_0\}$
have length $t_{i_2}$ and the rest have length $t_{i_3}$, with $t_{i_1} < t_{i_2} < t_{i_3}$. As we can see in
the picture, there are 9 $(t_{i_1})$-components: two of them, $N_1$ and $N_2$, have four points
and the rest are singletons. The whole space is $(t_{i_3})$-connected.

Taking $\alpha = 1$, $B_1$ and $B_2$ are $(t_{i_3})$-components $(t_{i_2})$-chained through the
1-smaller block $\{z_0\}$.

Remark 5.7. It is immediate to check that SL HC is strongly chaining. Moreover,
given any pair of $(t_j, t_i)$-chained subsets $B_1, B_2$ such that $X = B_1 \cup B_2$ with $t_j < t_i$,
$\{x_0, y_0\}$ is contained in some block of $\theta_{SL}(t_j)$ while $B_1$ is not contained in any block
of $\theta_{SL}(t)$ for any $t \in [t_j, t_i)$. In particular, $\theta_{SL}(t)$ does not refine $\{B_1, B_2\}$ for any
$t \geq t_j$. Also, SL is completely chaining. See Theorem 5.8.



Theorem 5.8. Let $\mathfrak{T}$ be a hierarchical clustering method. If for every metric space
$X$ and every $x, y, z, t \in X$, $u_{SL}(x, y) \leq u_{SL}(z, t)$ implies that $u(x, y) \leq u(z, t)$, then
$\mathfrak{T}$ is completely chaining. In particular, SL HC is completely chaining.

Proof. First, let us see that $\mathfrak{T}$ is strongly chaining. Consider two $(t_j, t_i)$-chained
subsets $B_1, B_2$. By hypothesis, there exist $x_0 \in B_1$, $y_0 \in B_2$ such that $u_{SL}(x_0, y_0) = t_j$.
Also, there exist $x_1, x_2 \in B_1$ with $u_{SL}(x_1, x_2) = t_i \geq t_j$. Thus, $u(x_1, x_2) \geq u(x_0, y_0)$.

If $B_1$ is contained in some block $B$ of $\theta(t)$, then $t \geq u(x_1, x_2) \geq u(x_0, y_0)$ and
$y_0 \in B$.

Figure 2. Components $B_1$ and $B_2$ are chained through the
smaller block $\{z_0\}$.

Let $B_0, B_k$ be two $(t_i)$-connected subsets $(t_j, t_i)$-chained through smaller blocks. Let
$x_0, \dots, x_k$ be the corresponding chain. Then, $u_{SL}(x_r, x_s) \leq t_j$ for every $1 \leq r, s \leq k$
and there exist $x, x' \in B_0$ such that $u_{SL}(x, x') = t_i \geq t_j$. Thus, $u(x, x') \geq u(x_r, x_s)$
for every $1 \leq r, s \leq k$.

Now, suppose $t > 0$ is such that $B_0$ is contained in some block $B$ of $\theta(t)$. Then,
$t \geq u(x, x') \geq u(x_r, x_s)$ for every $1 \leq r, s \leq k$ and $\{x_0, \dots, x_k\} \subset B$. □



AL and CL HC are not necessarily strongly chaining.

Example 5.9. Consider the graph from Figure 1. Suppose that, in addition, we
include edges of length $t_{i_2}$ from $x_1, x_2, x_3$ to every vertex in $N_1$ and from $y_1, y_2, y_3$
to every vertex in $N_2$. Also, suppose that $d(x_0, y_0) = d_0$ with $0 < d_0 < t_{i_2}$ and
$t_{i_1} + d_0 > t_{i_2}$.

Thus, every pair of points in $N_1$ (resp. $N_2$) are at distance $t_{i_1}$, $d(x_i, x_j) = t_{i_2}$
(resp. $d(y_i, y_j) = t_{i_2}$) for every $i \neq j$, $i, j = 0, \dots, 3$, $d(x_i, x') = t_{i_2}$ for every $x' \in N_1$
and every $1 \leq i \leq 3$, $d(y_j, y') = t_{i_2}$ for every $y' \in N_2$ and every $1 \leq j \leq 3$,
$d(x_0, y_0) = d_0$ and $d(x, y) > t_{i_2}$ for every $(x_0, y_0) \neq (x, y) \in B_1 \times B_2$.

Then, $B_1$ and $B_2$ are $(d_0, t_{i_2})$-chained subsets. However, $\theta_{AL}(t_{i_1}) = \theta_{CL}(t_{i_1}) =
\{\{x_1\}, \{x_2\}, \{x_3\}, N_1, N_2, \{y_1\}, \{y_2\}, \{y_3\}\}$ and $\theta_{AL}(t_{i_2}) = \theta_{CL}(t_{i_2}) = \{B_1, B_2\}$.
6. $\alpha$-unchaining single linkage hierarchical clustering: SL($\alpha$)

Given a finite metric space $(X, d)$, let $F_t(X)$ be the minimal flag complex whose
vertices are the points of $X$ and where two vertices $v, w$ are connected by an edge if
and only if $d(v, w) \leq t$. Recall that a simplicial complex $K$ is a flag complex if $\Delta$ is a
simplex of $K$ whenever $[v, w]$ belongs to $K$ for all $v, w \in \Delta$.
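
Since a simplex of a flag complex is exactly a set of vertices that are pairwise joined by edges, the dimension of $F_t(X)$ is the size of a largest such set minus one. The sketch below computes it by brute force, which is only reasonable for small examples; the function name `flag_dim` and the sample points are hypothetical.

```python
from itertools import combinations

def flag_dim(points, d, t):
    """Dimension of F_t(points): size of a largest subset whose pairwise
    distances are all <= t, minus one. Returns -1 for an empty vertex set."""
    for size in range(len(points), 0, -1):
        for subset in combinations(points, size):
            if all(d(v, w) <= t for v, w in combinations(subset, 2)):
                return size - 1
    return -1

# Hypothetical sample: four equally spaced points on the real line.
pts = [0, 1, 2, 3]
dist = lambda x, y: abs(x - y)
```

For these points the dimension grows with $t$: it is 0 below the smallest distance, 1 at $t = 1$, 2 at $t = 2$ and 3 once $t$ reaches the diameter. A high dimension at a small scale is what signals a dense set.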

We define a modified single linkage hierarchical clustering method, SL($\alpha$), on
the basis of SL HC by introducing a parameter $\alpha \in \mathbb{N}$. This method allows us to take
the density into account without introducing any other input in the algorithm apart
from $\alpha$ and the distances between the points.

Given a finite metric space $(X, d)$, let $\mathfrak{T}_{SL(\alpha)}(X) = \theta_\alpha$ denote the dendrogram
defined by SL($\alpha$) as follows:

Let $X = \{x_1, \dots, x_n\}$. Let $d_{ij} := d(x_i, x_j)$ and consider $(D, <)$, the ordered
set of all possible distances: "$<$" represents the order of the real numbers and
$D := \{t_i : 0 \leq i \leq m\} = \{d_{ij} : 1 \leq i, j \leq n\}$ with $t_i < t_j$ $\forall i < j$.

Clearly $t_0 = 0$. Let $\theta_\alpha(0) := \{\{x_1\}, \dots, \{x_n\}\}$ and $\theta_\alpha(t) := \theta_\alpha(0)$ $\forall t < t_1$. Now,
given $\theta_\alpha[t_{i-1}, t_i) = \theta_\alpha(t_{i-1}) = \{B_1, \dots, B_m\}$, we define recursively $\theta_\alpha$ on the interval
$[t_i, t_{i+1})$.

1) Let $G^{t_i}_\alpha$ be a graph with vertices $V(G^{t_i}_\alpha) := \{B_1, \dots, B_m\}$ and edges
$E(G^{t_i}_\alpha) := \{B_i, B_j\}$ such that the following conditions hold:

i) $\min\{d(x, y) \mid x \in B_i, y \in B_j\} \leq t_i$.

ii) there is a simplex $\Delta \in F_{t_i}(B_i \cup B_j)$ such that $\Delta \cap B_i \neq \emptyset$, $\Delta \cap B_j \neq \emptyset$
and $\alpha \cdot \dim(\Delta) \geq \min\{\dim(F_{t_i}(B_i)), \dim(F_{t_i}(B_j))\}$.

We may abuse notation and write $B_j$ to refer both to the block
of $\theta_\alpha(t_{i-1})$ and to the vertex of $G^{t_i}_\alpha$.

2) Let us define a relation, $\sim_{t_i,\alpha}$, as follows.

Let $A \in cc(G^{t_i}_\alpha)$, the set of connected components of the graph $G^{t_i}_\alpha$, with
$A = \{B_{i_1}, \dots, B_{i_r}\}$. Consider the subgraph $H_\alpha(A) \subset A$ whose vertices are
the blocks $B_{i_j} \in A$ such that

(1) $\alpha \cdot \#(B_{i_j}) \geq \max_{1 \leq l \leq r} \{\#(B_{i_l})\}$.

Then, $B_{i_k} \sim_{t_i,\alpha} B_{i_{k'}}$ if one of the following conditions holds:

iii) $\exists C \in cc(H_\alpha(A))$ such that $B_{i_k}, B_{i_{k'}} \in C$.

iv) $B_{i_k} \in C \in cc(H_\alpha(A))$, $B_{i_{k'}} \in A \setminus H_\alpha(A)$ and for every $C \neq C' \in cc(H_\alpha(A))$,
$B_{i_{k'}}$ is not connected in $G^{t_i}_\alpha \setminus C$ to $C'$.

That is, a small vertex $B_{i_{k'}}$ is related to the component of big vertices
$C$ if every path in $G^{t_i}_\alpha$ from $B_{i_{k'}}$ to any other component $C'$ crosses $C$.
Then, $\sim_{t_i,\alpha}$ induces an equivalence relation whose classes are contained
in the connected components of $G^{t_i}_\alpha$.

3) For every $t \in [t_i, t_{i+1})$, $\theta_\alpha(t) := \theta_\alpha(t_{i-1}) / \sim_{t_i,\alpha}$.

Condition i) is the condition used in SL HC to define the graph. See Remark 3.1.

Condition ii) is used to take account of the chaining effect between two adjacent
blocks. The idea is that if we have two adjacent blocks, densely packed, which are
close to each other as sets but whose dense centers are more distant, this condition
will delete the edge defined in i).
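
The effect of condition ii) can be tried out on a toy configuration. The sketch below is an illustration under stated assumptions, not the author's implementation: the inequality is read as $\alpha \cdot \dim(\Delta) \geq \min\{\dots\}$, all function names are hypothetical, and the distances (1 inside each dense core, 2 for the single chaining pair, 3 otherwise) are invented for the example.

```python
from itertools import combinations

def flag_dim(points, d, t):
    """Dimension of F_t(points), by brute force over vertex subsets."""
    for size in range(len(points), 0, -1):
        for subset in combinations(points, size):
            if all(d(v, w) <= t for v, w in combinations(subset, 2)):
                return size - 1
    return -1

def joining_dim(Bi, Bj, d, t):
    """Largest dimension of a simplex of F_t(Bi u Bj) meeting both blocks (-1 if none)."""
    union = list(Bi) + list(Bj)
    for size in range(len(union), 1, -1):
        for subset in combinations(union, size):
            s = set(subset)
            if s & set(Bi) and s & set(Bj) and all(
                    d(v, w) <= t for v, w in combinations(subset, 2)):
                return size - 1
    return -1

def condition_ii(Bi, Bj, d, t, alpha):
    # Assumed reading: alpha * dim(Delta) >= min of the two block dimensions.
    return alpha * joining_dim(Bi, Bj, d, t) >= min(flag_dim(Bi, d, t), flag_dim(Bj, d, t))

def d(v, w):
    if v == w:
        return 0
    if v[0] == w[0]:
        return 1                      # same dense core
    if {v, w} == {"x0", "y0"}:
        return 2                      # the single chaining pair
    return 3

N1 = ["x0", "x4", "x5", "x6"]
N2 = ["y0", "y4", "y5", "y6"]
```

At scale $t = 2$ each core spans a 3-simplex, while the only simplex meeting both cores is the edge $\{x_0, y_0\}$. Under the assumed reading, the edge between the two blocks is therefore dropped for $\alpha = 1$ and kept once $\alpha \geq 3$.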

Now consider the connected components of the graph $G^{t_i}_\alpha$. The first idea could
be to merge the connected components of this graph, as we saw above in Remark
3.1 in the case of SL, AL and CL HC. Instead, here we want to distinguish
the case when two blocks are chained by isolated points or small blocks which might
be considered as noise in the sample. See, for example, the point $z_0$ in Figure 2.

To treat this effect we focus on the "big" blocks. The selection depends on the
parameter $\alpha$ (which defines the sensitivity of the whole method
to chaining) and, proportionally, on the number of points involved. We use (1) to
fix the distinction between big blocks and small blocks.
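
The selection of big blocks can be written in a couple of lines. The sketch below assumes that a block counts as big when $\alpha$ times its cardinal is at least the cardinal of the largest block in the component; the function name `big_blocks` and the sample blocks are hypothetical.

```python
def big_blocks(blocks, alpha):
    """Blocks B of a component with alpha * #(B) >= size of the largest block."""
    largest = max(len(B) for B in blocks)
    return [B for B in blocks if alpha * len(B) >= largest]

# Hypothetical component: two blocks of four points linked through a singleton.
N1 = {"x0", "x4", "x5", "x6"}
Z = {"z0"}
N2 = {"y0", "y4", "y5", "y6"}
component = [N1, Z, N2]
```

With $\alpha = 1$ only the two four-point blocks are big and the singleton is treated as a small block. With $\alpha = 4$ every block is big, so a larger $\alpha$ makes the method behave more like plain single linkage.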

By iii), if two big blocks, $B, B'$, are joined by an edge in $G^{t_i}_\alpha$, then $B \sim_{t_i,\alpha} B'$.
This defines an equivalence class of big blocks. By iv), if a small block is connected
by chains of small blocks to two different classes of big blocks, we will consider it as
a block apart until the next distance $t_{i+1}$. See Example 6.1.

Figure 3. Graph with three connected components of big
blocks, $C_i$, and six small blocks $B_j$.

Example 6.1. Suppose $A \in cc(G^{t_i}_\alpha)$ is as represented in Figure 3. $H_\alpha(A)$ has three
connected components, $C_i$, $i = 1, \dots, 3$, and $A \setminus H_\alpha(A)$ consists of six small blocks $B_j$,
$j = 1, \dots, 6$. The components $C_i$ are merged by iii). The edges in the figure represent
the resulting edges from $G^{t_i}_\alpha$ after identifying the components $C_i$ by iii).

Then, by iv), since every path from $B_1, B_2$ to $C_2$ and to $C_3$ crosses $C_1$, $B_1, B_2$
are merged with $C_1$. Since every path from $B_5$ to $C_1$ and to $C_3$ crosses $C_2$, $B_5$
is merged with $C_2$. The rest of the small blocks can be joined to two different
components $C_k, C_{k'}$ independently and, therefore, they stay as independent blocks
in $\theta_\alpha(t_i)$. Hence, $\theta_\alpha(t_i) = \{\{C_1 \cup B_1 \cup B_2\}, \{C_2 \cup B_5\}, \{C_3\}, B_3, B_4, B_6\}$.

Remark 6.2. At step iii), if $H_\alpha(A)$ is connected, then $B_{i_1} \cup \dots \cup B_{i_r}$ defines a
block of $\theta_\alpha(t_i)$.

Remark 6.3. Notice that if two points $x, x'$ belong to the same block of $\theta_{SL}(t_i)$
then, necessarily, there exists at least a $t_i$-chain, $x = x_0, x_1, \dots, x_n = x'$ joining
them so that if $x_j \in B_j \in \theta(t_{i-1})$, $j = 0, \dots, n$, the corresponding edges $\{B_{j-1}, B_j\}$,
$j = 1, \dots, n$, satisfy condition ii). This is immediate by construction.



Figure 4. Dendrograms produced by SL($\alpha$) for the graph in Figure 1, $X_1$,
and the graph in Figure 2, $X_2$.

Let us look again at the sets from Example 5.5.

Example 6.4. Let $X_1$ be the graph from Figure 1 and let a = 1. Then, let us
check that applying SL(1) on $X_1$ the dendrogram generated is the left one from
Figure 4. $\theta_1(t) = \{\{x_0\}, \ldots, \{x_6\}, \{y_0\}, \ldots, \{y_6\}\}$ if $t < t_{i_1}$. If $t_{i_1} \leq t < t_{i_2}$, $\theta_1(t) = \{\{x_1\}, \{x_2\}, \{x_3\}, N_1, N_2, \{y_1\}, \{y_2\}, \{y_3\}\}$. There are eight $(t_{i_1})$-components, six
of them are singletons and two of them, $N_1, N_2$, with $\#(N_1) = \#(N_2) = 4$. Furthermore,
$x_0 \in N_1$, $y_0 \in N_2$ and $\dim F_{t_{i_1}}(N_s) = 3$ for $s = 1, 2$.

Now, for $t = t_{i_2}$, $B_1$ (respectively $B_2$) is merged into a single block. However,
condition ii) induces a separation of the blocks $N_1, N_2$. Therefore, $\theta_1(t_{i_2}) = \{B_1, B_2\}$.

Modifying the parameter a we can adjust the method to be more or less sensitive
to the chaining effect. Increasing a we would need higher dimensions in $F_{t_{i_2}}(N_s)$
to apply condition ii). Suppose $a \geq 3$. In this case, $B_1, B_2$ would be merged for
$t = t_{i_2}$. Thus, for $a \geq 3$, $\theta_a(t) = \{\{x_0\}, \ldots, \{x_6\}, \{y_0\}, \ldots, \{y_6\}\}$ if $t < t_{i_1}$, $\theta_a(t) = \{\{x_1\}, \{x_2\}, \{x_3\}, N_1, N_2, \{y_1\}, \{y_2\}, \{y_3\}\}$ if $t_{i_1} \leq t < t_{i_2}$ and $\theta_a(t) = \{X\}$ if
$t \geq t_{i_2}$.
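The role played by $\dim F_t(N_s)$ in this example can be checked numerically. The following is a minimal sketch, assuming that $F_t(N)$ denotes the flag (Vietoris-Rips) complex whose simplices are the subsets of $N$ with pairwise distances at most $t$, so that its dimension is the size of the largest such subset minus one. The names `flag_dimension` and `unit_metric`, and the toy four-point set, are illustrative assumptions and not part of the method's definition.

```python
from itertools import combinations


def unit_metric(p, q):
    """Toy metric (assumption): distinct points are at distance 1."""
    return 0.0 if p == q else 1.0


def flag_dimension(points, dist, t):
    """Dimension of the flag complex of `points` at scale t.

    Brute force: find the largest subset whose pairwise distances are
    all <= t and return its size minus one (-1 for the empty set).
    Exponential in len(points); intended for small examples only.
    """
    best = 0
    for size in range(1, len(points) + 1):
        found = any(
            all(dist(p, q) <= t for p, q in combinations(subset, 2))
            for subset in combinations(points, size)
        )
        if not found:
            break  # no clique of this size, so none of a larger size
        best = size
    return best - 1


# A block of four mutually t-close points, as N_1 or N_2 in the example.
N = [0, 1, 2, 3]
dim_N = flag_dimension(N, unit_metric, 1.0)
```

With four mutually close points the sketch returns dimension 3, which is the value compared against the parameter a when deciding whether the two dense blocks are kept apart.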



Figure 5. For the set of points in the figure there appears to be a
natural clustering $\{B_1, B_2, B_3\}$.



Example 6.5. Let $X_2$ be the graph from Figure 2 and let a = 1. Then, let us check
that applying SL(1) on $X_2$ the dendrogram generated is the right one from Figure 4.
As we saw in the previous example, $\theta_1(t) = \{\{x_0\}, \ldots, \{x_6\}, \{z_0\}, \{y_0\}, \ldots, \{y_6\}\}$ if
$t < t_{i_1}$. If $t_{i_1} \leq t < t_{i_2}$, $\theta_1(t) = \{\{x_1\}, \{x_2\}, \{x_3\}, N_1, \{z_0\}, N_2, \{y_1\}, \{y_2\}, \{y_3\}\}$.
There are nine $(t_{i_1})$-components, seven of them are singletons and two of them, $N_1$,
$N_2$, with $\#(N_1) = \#(N_2) = 4$. Furthermore, $x_0 \in N_1$, $y_0 \in N_2$ and $\dim F_{t_{i_1}}(N_s) = 3$
for $s = 1, 2$.

For $t = t_{i_2}$, conditions i) and ii) induce edges between $N_1$ and $\{z_0\}$ and between
$\{z_0\}$ and $N_2$. However, $N_1$ is not identified with $N_2$, by conditions iii) and iv). In fact, for every
$t_{i_1} \leq t < t_{i_3}$, $\theta_1(t) = \{\{x_1\}, \{x_2\}, \{x_3\}, N_1, \{z_0\}, N_2, \{y_1\}, \{y_2\}, \{y_3\}\}$.

For $t = t_{i_3}$, let us assume that $2(t_{i_1} + t_{i_2}) > t_{i_3}$. Then, again by conditions iii)
and iv), $\theta_1(t_{i_3}) = \{B_1, \{z_0\}, B_2\}$.

Example 6.6. Consider the set represented in Figure 5. Let us suppose that there
are three distances, $t_{i_1} < t_{i_2} < t_{i_3}$, which are represented, respectively, by a short
segment, a dotted line and a thick long segment. Let us assume that the sets $B_1, B_2, B_3$
are $(t_{i_3})$-connected and that $d(B_k, B_{k+1}) = t_{i_3}$, $k = 1, 2$. Also, $\dim(F_{t_{i_2}}(B_k)) \geq 3$
for $1 \leq k \leq 3$. In Figure 6 we show the corresponding dendrograms for SL(1)
and SL assuming $\dim(F_{t_{i_2}}(B_k)) = 3$. Notice that SL = SL(a) for any $a \geq 3$.
Figure 6. Dendrograms $\theta_1$ and $\theta_a$ with $a \geq 3$ for Example 6.6.



It is clear that SL HC generates a dendrogram where it is impossible to detect
the clustering $\{B_1, B_2, B_3\}$ because of the chaining effect. Introducing the parameter
a = 1, in this particular case, we obtain a hierarchical clustering which is consistent
with the distribution of the sample.

It may be noticed that SL(a) does not detect the possible clustering $\{B_1, B_2\}$
in the graph represented in Figure 7 nor the possible clustering $\{B_1, \{z_0\}, B_2\}$ in
the graph represented in Figure 8. See the examples below. This illustrates the fact
that our method does not take account of the properties of the whole distribution
for a certain $t_i$. Instead, we focus on the relations between the blocks
from $\theta(t_{i-1})$.

Example 6.7. Let $X_1$ be the graph from Figure 7 where every edge has length 1.
Then, $X_1$ has two (1, 1)-chained subsets. Let us fix a = 1.

If $t < 1$, $\theta_1(t) = \{\{x_0\}, \ldots, \{x_3\}, \{y_0\}, \ldots, \{y_3\}\}$. If $t = 1$, in ii) we consider
the dimension of the complexes defined from the blocks in $t_0$, which are singletons.
Therefore, every edge in the graph defines an edge in $G^1_1$.

Since $G^1_1$ is connected and all the blocks defining its vertices are singletons,
$H_1(G^1_1) = G^1_1$ and $\theta_1(1) = \{X_1\}$.

For $a > 1$ also SL(a) = SL(1) = SL.

Example 6.8. Let $X_2$ be the graph from Figure 8 where every edge has length 1.
Let us fix a = 1.
Figure 7. The blocks $B_1, B_2$ are (1, 1)-chained subsets with no
dense cores. SL(a) does not separate them.

Figure 8. SL(a) does not separate the components $B_1, B_2$ which
are chained through a smaller block but have no dense cores.



If $t < 1$, $\theta_1(t) = \{\{x_0\}, \ldots, \{x_3\}, \{z_0\}, \{y_0\}, \ldots, \{y_3\}\}$. If $t = 1$, in ii) we consider
the dimension of the complexes defined from the blocks in $t_0$, which are
singletons. Therefore, every edge in the graph defines an edge in $G^1_1$.

Since $G^1_1$ is connected and all the blocks defining its vertices are singletons,
$H_1(G^1_1) = G^1_1$ and $\theta_1(1) = \{X_2\}$.

For $a > 1$ also SL(a) = SL(1) = SL.

7. Basic properties of SL(a) 

Here, we check for SL(a) the properties seen in Section 4. We find that it
shares most of them with SL, except A2) and being stable in the Gromov-Hausdorff
sense. Small changes in the distances may affect the dimension of the flag
complex and whether or not condition ii) applies. Also, they may affect the
size of the components and yield very different graphs $G^a_{t_i}$. Furthermore, changing
the parameter a we may obtain a very different dendrogram. However, all the
instability is produced by the unchaining conditions ii), iii) and iv). Thus, we can
use the relation stated in Proposition 7.5 to compare $\theta_a$ with $\theta_{SL}$. This allows us,
at least, to keep track of the undesired effects on the stability introduced with the
parameter a and the unchaining conditions. Also, if we increase the parameter a, at
some point we obtain that SL(a) = SL. See Proposition 7.2. Thus, in practice, it
may be interesting to study the whole family of dendrograms obtained considering
all values for the parameter.

The following result is clear from the definition. 

Proposition 7.1. SL(a) is a permutation invariant algorithm. 

Proposition 7.2. Let X = {x\, ...,x n } be a finite metric space. If a > , then 
$\mathfrak{T}_{SL}(X) = \mathfrak{T}_{SL(a)}(X)$.

Proof. We know that $\theta_{SL}(t_0) = \theta_a(t_0)$. Suppose $\theta_{SL}(t_{i-1}) = \theta_a(t_{i-1})$.

First, let us see that for a > condition i) already implies ii) and the 

edges of the graph G^j are those defined by condition i). Let B\,B 2 two 
connected subsets such that mm{d(x, y)\x £ B±, y £ B 2 } < U. For any simplex A, 
a-dim(A) > a and since a > ^=^, a-dim(A) > a > min{#(i? 1 ) - 1, #(B 2 ) - 1} > 
mm{dim(F u {B 1 )),dim(F u (B 2 ))}. 

Let A be any connected component of with r blocks. If the subgraph H a {A) 
is not connected, then there are at least three blocks B^, Bi 2 , Bi 3 in A, such that 
1 < #(BiJ < ^max 1 < i < r {#( J B i )} and #(B 42 ), #(S J3 ) > i ^max 1 <,< P {#(S,)}- 
In particular, r > 3 and hence, there is a contradiction since 1 < E ^ < "~ 2 = §• 



Thus, $H_a(A)$ is connected and, as we saw in Remark 6.2, all the blocks in A are
identified. Therefore, $\theta_a(t_i) = \theta_{SL}(t_i)$. □

Unfortunately, most of the good stability properties of SL do not hold for SL(a). 
SL(a) is not stable in the Gromov-Hausdorff sense under small perturbations of 
the distances. 



Example 7.3. Let $(X, d)$ be the graph from Example 6.7 (see Figure 7) where
every edge has length 1 and let $(X', d')$ be the same graph where $d'(x_0, y_0) = 1 + \varepsilon$ for
some $\varepsilon$ and the rest of the edges have length 1. Let $\theta_1 = \mathfrak{T}_{SL(1)}(X)$ and $\theta'_1 = \mathfrak{T}_{SL(1)}(X')$.

As we saw above, $\theta_1(t) = \{\{x_0\}, \ldots, \{x_3\}, \{y_0\}, \ldots, \{y_3\}\}$ if $t < 1$ and $\theta_1(1) = \{X_1\}$.
Thus, if $\eta(\theta_1) = u$ it follows that $u(x, y) = 1$ $\forall x, y \in X_1$.

If we apply SL(1) to $X'$ we obtain that $\theta'_1(t) = \{\{x'_0\}, \ldots, \{x'_3\}, \{y'_0\}, \ldots, \{y'_3\}\}$ if
$t < 1$ and $\theta'_1(t) = \{B'_1, B'_2\}$ for $1 \leq t < 1 + \varepsilon$. For $1 + \varepsilon \leq t < 2$, there is no
edge between $B'_1$ and $B'_2$ by condition ii). Thus, $\theta'_1(t) = \{B'_1, B'_2\}$ for $1 + \varepsilon \leq t < 2 + \varepsilon$.
For $t \geq 2 + \varepsilon$, $\theta'_1(t) = \{X'\}$. Thus, if $\eta(\theta'_1) = u'$ it follows that $u'(x'_i, x'_j) = 1$
$\forall x'_i, x'_j \in B'_1$, $u'(y'_i, y'_j) = 1$ $\forall y'_i, y'_j \in B'_2$ and $u'(x'_i, y'_j) = 2 + \varepsilon$ $\forall (x'_i, y'_j) \in B'_1 \times B'_2$.

In this case, $d_{GH}((X, d), (X', d')) = \frac{\varepsilon}{2}$ and $d_{GH}((X, u), (X', u')) = \frac{1+\varepsilon}{2}$. Therefore,
SL(a) is not stable in the Gromov-Hausdorff sense.
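The size of the gap between the two ultrametrics in Example 7.3 can be illustrated with a short computation. The sketch below evaluates half the distortion of the identity correspondence between $(X, u)$ and $(X', u')$, which is a standard upper bound for the Gromov-Hausdorff distance; it does not compute the distance itself. The value `EPS = 0.25`, the point labels and the function name `identity_distortion_bound` are assumptions chosen for illustration.

```python
from itertools import combinations

EPS = 0.25  # illustrative value of epsilon (assumption)
B1 = ["x0", "x1", "x2", "x3"]
B2 = ["y0", "y1", "y2", "y3"]
POINTS = B1 + B2


def u(p, q):
    """Ultrametric of the unperturbed graph: everything merges at t = 1."""
    return 0.0 if p == q else 1.0


def u_prime(p, q):
    """Ultrametric after the perturbation: the two blocks merge at 2 + EPS."""
    if p == q:
        return 0.0
    same_block = (p in B1) == (q in B1)
    return 1.0 if same_block else 2.0 + EPS


def identity_distortion_bound(points, f, g):
    """Half the largest |f(p, q) - g(p, q)| over all pairs of points.

    This bounds the Gromov-Hausdorff distance from above, using the
    identity map as the correspondence between the two spaces.
    """
    return 0.5 * max(abs(f(p, q) - g(p, q)) for p, q in combinations(points, 2))


bound = identity_distortion_bound(POINTS, u, u_prime)
```

For this choice of `EPS` the bound equals $(1 + \varepsilon)/2$, so an arbitrarily small perturbation of one edge moves the output ultrametric by roughly one half.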

Also, it is unstable under the change of the parameter a. 



Example 7.4. Let $(X', d')$ be the graph with the metric defined in Example 7.3.

As we just saw, if $\eta(\theta'_1) = u'$ it follows that $u'(x'_i, x'_j) = 1$ $\forall x'_i, x'_j \in B'_1$,
$u'(y'_i, y'_j) = 1$ $\forall y'_i, y'_j \in B'_2$ and $u'(x'_i, y'_j) = 2 + \varepsilon$ $\forall (x'_i, y'_j) \in B'_1 \times B'_2$.

If we apply SL(3) to $(X', d')$ we obtain that $\theta'_3(t) = \{\{x'_0\}, \ldots, \{x'_3\}, \{y'_0\}, \ldots, \{y'_3\}\}$
if $t < 1$ and $\theta'_3(t) = \{B'_1, B'_2\}$ for $1 \leq t < 1 + \varepsilon$. For $1 + \varepsilon \leq t < 2$, since a = 3
there is an edge between $B'_1$ and $B'_2$. Thus, $\theta'_3(t) = \{X'\}$ for $1 + \varepsilon \leq t$. Hence, if
$\eta(\theta'_3) = u''$ it follows that $u''(x'_i, x'_j) = 1$ $\forall x'_i, x'_j \in B'_1$, $u''(y'_i, y'_j) = 1$ $\forall y'_i, y'_j \in B'_2$
and $u''(x'_i, y'_j) = 1 + \varepsilon$ $\forall (x'_i, y'_j) \in B'_1 \times B'_2$.

Therefore, $d_{GH}((X', u'), (X', u'')) = \frac{1}{2}$.

Notation: Let X be a finite metric space. Let us recall that $\mathfrak{T}_{SL(a)}(X) = \theta_a(X)$
or just $\theta_a$. Let us denote $\eta(\theta_a) = u_a$.

Proposition 7.5. $u_{SL} \leq u_a$ for every $a \in \mathbb{N}$.



Proof. As we saw in Remark 6.3, if two points $x, x' \in X$ belong to the same block
of $\theta_a(t)$, they belong, in particular, to the same t-component of X and, therefore,
to the same block of $\theta_{SL}(t)$. Thus, $u_{SL}(x, x') \leq u_a(x, x')$. □

We could have simply stated that $\mathfrak{T}_{SL(a)}$ holds A0).

Proposition 7.6. If $(X, d)$ is an ultrametric space, then $\theta_a = \theta_{SL}$ for every a.

Proof. By definition, $\theta_a(t_0) = \theta_{SL}(t_0)$. Suppose $\theta_a(t_{i-1}) = \theta_{SL}(t_{i-1}) = \{B_1, \ldots, B_n\}$.
Let us see that $\theta_a(t_i) = \theta_{SL}(t_i)$.

Let $B_i, B_j$ be such that $\min\{d(x, y) \mid x \in B_i, y \in B_j\} \leq t_i$. Since $B_i, B_j$
are $(t_{i-1})$-components, by the properties of the ultrametric, $d(x, y) \leq t_i$ for every
$(x, y) \in B_i \times B_j$.

Therefore, all the points in $B_i \cup B_j$ define a simplex in $F_{t_i}(B_i \cup B_j)$ and condition
ii) holds for every a. Thus, there is an edge defined between $B_i$ and $B_j$.

Now, let $B_i, B_j$ be two blocks in the same connected component of $G^a_{t_i}$. Then,
by the properties of the ultrametric, $\{B_i, B_j\}$ is an edge of $G^a_{t_i}$. Hence, $H_a(G^a_{t_i})$
is connected and, as we saw in Remark 6.2, $\theta_a(t_i)$ is defined by the connected
components of $G^a_{t_i}$. By 3.1 this proves that $\theta_a = \theta_{SL}$. □



Corollary 7.7. If $(X, d)$ is an ultrametric space, then $u_a(x, y) = d(x, y)$ for every
$x, y \in X$.
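The fixed-point behaviour on ultrametric spaces can be illustrated for plain SL, which is the case covered when a is large enough (Proposition 7.2). The sketch below uses the standard description of the SL ultrametric as the minimum, over all chains joining two points, of the largest step in the chain; it does not implement the unchaining conditions of SL(a). The function names and the two small distance matrices are assumptions made for illustration.

```python
def single_linkage_ultrametric(d):
    """Minimax-path closure of a distance matrix.

    u[i][j] is the smallest t such that i and j are joined by a chain
    whose consecutive points are at distance <= t. Computed with a
    Floyd-Warshall style triple loop over intermediate points.
    """
    n = len(d)
    u = [row[:] for row in d]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                u[i][j] = min(u[i][j], max(u[i][k], u[k][j]))
    return u


def is_ultrametric(d):
    """Check the strong triangle inequality d(i,j) <= max(d(i,k), d(k,j))."""
    n = len(d)
    return all(
        d[i][j] <= max(d[i][k], d[k][j])
        for i in range(n) for j in range(n) for k in range(n)
    )


# An ultrametric space on three points (assumed toy data).
ULTRA = [[0, 1, 3],
         [1, 0, 3],
         [3, 3, 0]]

# Three equally spaced points on a line: a metric that is not ultrametric.
LINE = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]

u_ultra = single_linkage_ultrametric(ULTRA)
u_line = single_linkage_ultrametric(LINE)
```

On `ULTRA` the closure returns the input unchanged, while on `LINE` the two end points are pulled together by the chain through the middle point.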

Corollary 7.8. SL(a) holds A0) and A2) but not A1).

Corollary 7.9. SL(a) satisfies the richness property.

One may wonder if, given $a > a'$, anything can be said about the corresponding
dendrograms. In particular, given $\mathfrak{T}_{SL(a)}(X) = \theta_a$ and $\mathfrak{T}_{SL(a')}(X) = \theta_{a'}$, whether $u_a \leq u_{a'}$
or $u_{a'} \leq u_a$. This need not be true. In fact, it may fail by conditions iii) and
iv), see Example 7.10, or by condition ii), see Example 7.11.
Figure 9. A change in the parameter a may have different effects
on the chaining through smaller blocks.



Example 7.10. Let $a > a'$. Suppose that $\theta_a(t_{i-1}) = \theta_{a'}(t_{i-1}) = \{B_1, B_2, B_3\}$.
See the example above from Figure 9. Now, suppose that conditions i), ii) define
edges $\{B_1, B_2\}$ and $\{B_2, B_3\}$ but not $\{B_1, B_3\}$ in both $G^a_{t_i}$ and $G^{a'}_{t_i}$.

Suppose that $\max_{1 \leq i \leq 3}\{\#(B_i)\} = \#(B_1)$. Also, let us suppose that $3a \cdot \#(B_2) < \#(B_1)$,
$3a' \cdot \#(B_2) < \#(B_1)$, $3a \cdot \#(B_3) \geq \#(B_1)$ but $3a' \cdot \#(B_3) < \#(B_1)$. In this
case, there is a unique connected component $A = \{B_1, B_2, B_3\}$ and $H_{a'}(A) = \{B_1\}$
is connected while $H_a(A) = \{B_1, B_3\}$ is not connected. Thus, $\theta_a(t_i) = \{B_1, B_2, B_3\}$
and $\theta_{a'}(t_i) = \{B_1 \cup B_2 \cup B_3\} = \{X\}$.

Suppose that $\theta_a(t_{i-1}) = \theta_{a'}(t_{i-1}) = \{B'_1, B'_2, B'_3\}$. See the example below from
Figure 9. Now, suppose that conditions i), ii) define edges $\{B'_1, B'_2\}$ and $\{B'_2, B'_3\}$
but not $\{B'_1, B'_3\}$ in both $G^a_{t_i}$ and $G^{a'}_{t_i}$.

Suppose that $\max_{1 \leq i \leq 3}\{\#(B'_i)\} = \#(B'_1)$. Let us suppose that $3a \cdot \#(B'_3) \geq \#(B'_1)$,
$3a' \cdot \#(B'_3) \geq \#(B'_1)$, $3a \cdot \#(B'_2) \geq \#(B'_1)$ but $3a' \cdot \#(B'_2) < \#(B'_1)$. In this
case, there is a unique connected component $A = \{B'_1, B'_2, B'_3\}$, $H_a(A)$ has vertices
$B'_1, B'_2, B'_3$ and it is connected while $H_{a'}(A) = \{B'_1, B'_3\}$ is not connected. Thus,
$\theta_a(t_i) = \{B'_1 \cup B'_2 \cup B'_3\} = \{X\}$ and $\theta_{a'}(t_i) = \{B'_1, B'_2, B'_3\}$.

Hence, even in the case when there is no chain effect between adjacent blocks,
$\theta_a(t_i)$ need not refine $\theta_{a'}(t_i)$ and $\theta_{a'}(t_i)$ need not refine $\theta_a(t_i)$.

In particular, $u_a \not\leq u_{a'}$ and $u_{a'} \not\leq u_a$.
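The two situations in Example 7.10 can be reproduced with concrete block sizes. The sketch below assumes, as the inequalities in the example suggest, that a block $B$ of a component $A$ is kept as a vertex of $H_a(A)$ exactly when $3a \cdot \#(B) \geq \max_i \#(B_i)$, and it only tests whether the resulting subgraph is connected. The sizes, the parameter values a = 2 and a' = 1, and the function names are hypothetical choices for illustration.

```python
def core_blocks(sizes, a):
    """Blocks B with 3 * a * #(B) >= size of the largest block (assumed rule)."""
    largest = max(sizes.values())
    return {name for name, size in sizes.items() if 3 * a * size >= largest}


def is_connected(vertices, edges):
    """Connectivity of the subgraph induced on `vertices` by `edges`."""
    vertices = set(vertices)
    if not vertices:
        return True
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for p, q in edges:
            for s, t in ((p, q), (q, p)):
                if s == v and t in vertices and t not in seen:
                    seen.add(t)
                    stack.append(t)
    return seen == vertices


# Edges {B1, B2} and {B2, B3} but not {B1, B3}, as in both cases of the example.
EDGES = [("B1", "B2"), ("B2", "B3")]

# Hypothetical sizes realising the first and the second case.
FIRST = {"B1": 12, "B2": 1, "B3": 2}
SECOND = {"B1": 12, "B2": 2, "B3": 4}
A, A_PRIME = 2, 1

first_a = is_connected(core_blocks(FIRST, A), EDGES)
first_a_prime = is_connected(core_blocks(FIRST, A_PRIME), EDGES)
second_a = is_connected(core_blocks(SECOND, A), EDGES)
second_a_prime = is_connected(core_blocks(SECOND, A_PRIME), EDGES)
```

In the first case the larger parameter gives a disconnected subgraph and the smaller one a connected subgraph, and in the second case the roles are reversed, which matches the lack of monotonicity in a described in the example.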
Example 7.11. Let $a > a'$. Suppose $\theta_a(t_{i-1}) = \theta_{a'}(t_{i-1}) = \{B_1, B_2, B_3, B_4\}$,
$d(B_1, B_2) = d(B_3, B_4) = t_i$, $d(B_1, B_3) = t_{i+1}$ and the rest of the respective distances
between these blocks are bigger than $t_{i+1}$. See Figure 10.

Since a > a' , we may assume, by condition ii), that there is an edge between 
B\,B 2 and between B^,B^ in but not in G*,. Thus, suppose 9 a {ti) = {Bq,By} 
while e a .(U) = {B U B 2 ,B 3 ,B 4 }. 

Now, we may assume that dim(F ti+1 (Bi)),dim(F t {B 2 )) < dim(F t . +1 (B e )) 
and dim(F ti+1 (B 3 )), dim{F t (B^)) < dim(F ti+1 (Bj)). Thus, we may also assume 
that, at U + i, for a' there is no edge between Bq,Bj but for a there is an edge 
between B\,B^. Therefore, 9 a '(U+i) — {BqiB-?} while 9 a (ti+i) = {-B5, B 2 , B4}. 

Hence, 9 a (ti) does not refine 9 a >{ti) and 9 a '{ti) does not refine 9 a (ti). 

In particular, it is immediate to check that u a j£ u a i and u a i ^ u a . 







Figure 10. A bigger a does not imply a smaller ultrametric. 

8. Unchaining properties of SL(a)

Definition 8.1. Let $\mathfrak{T}$ be a HC method and $\mathfrak{T}(X) = \theta$. We say that $\mathfrak{T}$ is weakly
unchaining for the parameter a if the following implication holds:

Let X be a finite metric space such that $X = B_1 \cup B_2$, with $B_1, B_2$ a pair of
$(t_j, t_i)$-chained subsets. Suppose there exist $N_1 \subset B_1$, $N_2 \subset B_2$ such that

• $N_s$ is contained in some block $B_s^{j-1}$ of $\theta(t_{j-1})$, $s = 1, 2$,

• $\dim F_{t_j}(N_s) > a$, $s = 1, 2$,

• $x_0 \in N_1$, $y_0 \in N_2$,

• $\sup_{x, x' \in B_1}\{d(x, x')\} \leq t_i$ and $\sup_{y, y' \in B_2}\{d(y, y')\} \leq t_i$.

Then, there exists $t > 0$ such that $\theta(t) = \{B_1, B_2\}$.

We say that $\mathfrak{T}$ is weakly unchaining if it is weakly unchaining for some parameter a.

Remark 8.2. Notice that in the definition above we consider two chained subsets
with further conditions. Therefore, if a HC method $\mathfrak{T}$ is strongly chaining, in
particular, it is not weakly unchaining.

Theorem 8.3. Let X be a finite metric space such that $X = B_1 \cup B_2$, with $B_1, B_2$
a pair of $(t_j, t_i)$-chained subsets. Suppose there exist $N_1 \subset B_1$, $N_2 \subset B_2$ such that

• $N_s$ is contained in some block $B_s^{j-1}$ of $\theta(t_{j-1})$, $s = 1, 2$,

• $\dim F_{t_j}(N_s) > a$, $s = 1, 2$,

• $x_0 \in N_1$, $y_0 \in N_2$.

Then, $\theta_a(t_i)$ refines $\{B_1, B_2\}$. If, in addition, $\sup_{x, x' \in B_1}\{d(x, x')\} \leq t_i$ and
$\sup_{y, y' \in B_2}\{d(y, y')\} \leq t_i$, then $\theta_a(t_i) = \{B_1, B_2\}$.

Proof. Let us recall that, by definition, < tj < ti. 

For the first part it suffices to check that for every pair $(x, y) \in B_1 \times B_2$, $\{x, y\}$
is not contained in any block of $\theta_a(t_i)$, that is, $u_a(x, y) > t_i$.

Let $(x, y) \in B_1 \times B_2$. First, notice that for any $t < t_j$, there is no t-chain joining
x to y. Thus, $u_a(x, y) \geq t_j$. Let us check that $u_a(x, y) > t_{j+k}$, $k = 0, \ldots, i - j$.

For $k = 0$, since $x_0 \in N_1 \subset B_1^{j-1}$ and $y_0 \in N_2 \subset B_2^{j-1}$, condition ii) implies
that there is no edge in $G^a_{t_j}$ between $B_1^{j-1}$ and $B_2^{j-1}$. Since $d(x_1, y_1) > t_i$ for
every $(x_0, y_0) \neq (x_1, y_1) \in B_1 \times B_2$ there is no $t_i$-chain joining x to y which does
not contain the edge $\{x_0, y_0\}$. In particular, there is no $t_j$-chain joining x to y
which does not contain the edge $\{x_0, y_0\}$. Therefore, by Remark 6.3 it follows that
$u_a(x, y) > t_j$.

The same argument works for every $0 < k \leq i - j$. Thus, $u_a(x, y) > t_i$ and $\theta_a(t_i)$
refines $\{B_1, B_2\}$.

Suppose, in addition, that $\sup_{x, x' \in B_1}\{d(x, x')\} \leq t_i$ and $\sup_{y, y' \in B_2}\{d(y, y')\} \leq t_i$.
We already proved that $\theta_a(t_i)$ refines $\{B_1, B_2\}$. Clearly, since $\sup_{x, x' \in B_1}\{d(x, x')\} \leq t_i$
(respectively, for $B_2$), all the blocks of $\theta_a(t_{i-1})$ contained in $B_1$ (resp. $B_2$) are joined by
an edge in $G^a_{t_i}$. Therefore, $B_1$ (resp. $B_2$) is a block of $\theta_a(t_i)$. □

Corollary 8.4. SL(a) is weakly unchaining for the parameter a.
See Example 6.4.

Remark 8.5. AL and CL HC are not weakly unchaining.

Consider the graph in Figure 1. To check that CL HC is not weakly unchaining
suppose that we add some edges between $N_1$ and $N_2$ so that $\forall (x_0, y_0) \neq (x, y) \in N_1 \times N_2$,
$d(x, y) = t_{i_1} + t_{i_2}$.

Notice that this graph satisfies the conditions in the definition of weakly unchaining.
Then, it suffices to check that $\theta_{CL}(t)$ is never $\{B_1, B_2\}$.

It is immediate to check that $\ell^{CL}(N_1, N_2) = t_{i_1} + t_{i_2}$. Then, it is readily seen that
$\theta_{CL}(t) = \{x_1, x_2, x_3, N_1, N_2, y_1, y_2, y_3\}$ for every $t_{i_1} \leq t < t_{i_1} + t_{i_2}$ and
$\theta_{CL}(t_{i_1} + t_{i_2}) = \{X\}$.

To check that AL HC is not weakly unchaining suppose that in Figure [TJ we 
made d(xo,yo) = d := t i2 — \t il . Let us see that 9Ab{t) is never {Bi,B 2 }. 
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First, notice that this graph holds the conditions in the definition of weakly un- 
chaining since d a < t i2 < d a +t il . Also, it is immediate to check that £ (Ni, N 2 ) = 
\U X + h — £ AL (x l ,N 1 ) = l AL (yj,N 2 ), i,j = 1,3. Thus, it is readily seen that 
9 al if) = {xi,x 2 ,x 3 ,N 1 ,N 2 ,yi,y 2 ,y 3 } for every t h < t < \t. h +t l2 and 9 A l{\U 1 + 
U 2 ) = {X}. 




Figure 11. If $\theta_a(t_{i-1})$ holds the conditions from Definition 8.6
with equalities on conditions c) and d), then $G^a_{t_i}$ is the graph above
and $\theta_a(t_i)$ is as indicated.



Definition 8.6. Let $\mathfrak{T}$ be a HC method and $\mathfrak{T}(X) = \theta$. $\mathfrak{T}$ is a-unchaining if it is weakly
unchaining for the parameter a and the following implication holds:
Let X be a finite metric space and let

$\theta(t_{i-1}) = \{B_1, B_2, \{z_0\}, \ldots, \{z_k\}, \{x_1\}, \ldots, \{x_n\}, \{y_1\}, \ldots, \{y_m\}\}$

with $z_j, x_r, y_s$ single points for every j, r, s. Suppose that

a) $d(z_{j-1}, z_j) = t_i$ for every $1 \leq j \leq k$,

b) $d(z_{j_1}, z_{j_2}) > t_i$ for every $|j_1 - j_2| > 1$,

c) $d(x_r, B_1) \leq t_i$ for every r,

d) $d(y_s, B_2) \leq t_i$ for every s,

e) $\min_{r, j, s}\{d(x_r, z_j), d(z_j, y_s), d(x_r, B_2), d(y_s, B_1), d(x_r, y_s), d(B_1, B_2)\} > t_i$,

f) $Ka < \max\{\#(B_1), \#(B_2)\}$, $Ka \cdot \#(B_1) \geq \#(B_2)$ and $Ka \cdot \#(B_2) \geq \#(B_1)$
with $K = k + n + m + 3$.

Then, there exists $t > 0$ such that

$\theta(t) = \{\{B_1 \cup x_1 \cup \cdots \cup x_n\}, z_0, \ldots, z_k, \{B_2 \cup y_1 \cup \cdots \cup y_m\}\}$.

$\mathfrak{T}$ is unchaining if it is a-unchaining for some parameter a.

Remark 8.7. Notice that in the conditions above, if rmn{t \ x ^ t y Vx,y £ B{\ 
U, then B\ and B 2 are (ti,U)-chained through the a-smaller blocks ZQ,...,z k . 




Figure 12. If $t_j = t_i$ and $\theta_a(t_{i-1})$ holds the conditions from
Proposition 8.8 with equalities on conditions c) and d), then $G^a_{t_i}$ is
the graph above and $\theta_a(t_i)$ is as indicated.



Theorem 8.8. Let X be a finite metric space and let

$\theta_a(t_{j-1}) = \{B_0, B_1, \ldots, B_{k-1}, B_k, B'_1, \ldots, B'_n, B''_1, \ldots, B''_m\}$

with $t_j \leq t_i < 2t_j$. Suppose that

a) $d(B_{\ell-1}, B_\ell) = t_j$ for every $1 \leq \ell \leq k$,

b) $d(B_{\ell_1}, B_{\ell_2}) > t_i$ for every $|\ell_1 - \ell_2| > 1$,

c) $d(B'_r, B_0) \leq t_i$ for every r,

d) $d(B''_s, B_k) \leq t_i$ for every s,

e) $d(B'_r, B_\ell) > t_i$ for every r and every $1 \leq \ell \leq k$,

f) $d(B_\ell, B''_s) > t_i$ for every s and every $0 \leq \ell \leq k - 1$,

g) $Ka \cdot \max_{1 \leq \ell \leq k-1}\{\#(B_\ell)\} < \max\{\#(B_0), \#(B_k)\}$, $Ka \cdot \#(B_0) \geq \#(B_k)$ and
$Ka \cdot \#(B_k) \geq \#(B_0)$ with $K = k + n + m + 1$,

h) $a \geq \dim(F_{t_i}(B_\ell))$ for every $1 \leq \ell \leq k - 1$, $a \geq \dim(F_{t_i}(B'_r))$, $a \geq \dim(F_{t_i}(B''_s))$
for every r, s.

Then,

$\theta_a(t_i) = \{\{B_0 \cup B'_1 \cup \cdots \cup B'_n\}, B_1, \ldots, B_{k-1}, \{B_k \cup B''_1 \cup \cdots \cup B''_m\}\}$.

Proof. Let

$\theta_a(t_{j-1}) = \{B_0, B_1, \ldots, B_{k-1}, B_k, B'_1, \ldots, B'_n, B''_1, \ldots, B''_m\}$

holding the conditions above. For $t = t_j$ let us apply conditions i) and ii) of
SL(a). Since $a \geq \dim(F_{t_i}(B_\ell)) \geq \dim(F_{t_j}(B_\ell))$ for every $1 \leq \ell \leq k - 1$, we
obtain edges $\{B_{\ell-1}, B_\ell\}$ for every $1 \leq \ell \leq k$. Since $a \geq \dim(F_{t_i}(B'_r))$, $a \geq \dim(F_{t_i}(B''_s))$
for every r, s, we also obtain edges $\{B'_r, B_0\}$ and $\{B''_s, B_k\}$ for every
r, s such that the distance is less than or equal to $t_j$. Thus, the blocks $B_\ell$, $0 \leq \ell \leq k$,
are in the same connected component of $G^a_{t_j}$. Since $Ka \cdot \max_{1 \leq \ell \leq k-1}\{\#(B_\ell)\} < \max\{\#(B_0), \#(B_k)\}$,
by iv), $B_\ell$ is an independent block in $\theta_a(t_j)$ for every $1 \leq \ell \leq k - 1$.
Also, by iv), the blocks $B'_r$ joined by an edge to $B_0$ (resp. the blocks $B''_s$
joined by an edge to $B_k$) are merged with $B_0$ (resp. $B_k$).
The same argument holds for every $t_j \leq t \leq t_i$. Thus,

$\theta_a(t_i) = \{\{B_0 \cup B'_1 \cup \cdots \cup B'_n\}, B_1, \ldots, B_{k-1}, \{B_k \cup B''_1 \cup \cdots \cup B''_m\}\}$.

□

Renaming the corresponding blocks, it is immediate to obtain the following:

Corollary 8.9. SL(a) is a-unchaining.
See Example 6.5.